On Statistical Significance of Signal* 
A definition for the statistical significance of a signal in an
experiment is proposed by establishing a correlation between the
observed p-value and the normal distribution integral probability,
which is suitable for both counting experiments and continuous test
statistics. The explicit expressions to calculate the statistical
significance for both cases are given.

I. INTRODUCTION

The statistical significance of a signal in a particle physics
experiment quantifies the degree of confidence with which the
observation either confirms or disproves a null hypothesis $H_0$
in favor of an alternative hypothesis $H_1$. Usually $H_0$ stands
for known or background processes, while the alternative
hypothesis $H_1$ stands for a new or signal process plus
background processes, with their respective production cross
sections. This concept is useful for ordinary measurements, since
it gives an intuitive estimate of the extent to which the observed
phenomena can be attributed to backgrounds or to a signal. It
becomes crucial for measurements that claim a new discovery or a
new signal. By convention in particle physics experiments, the
"$5\sigma$" standard, namely a statistical significance $S > 5$,
is required to define the sensitivity for discovery, while in the
cases $S > 3$ ($S > 2$) one may claim that the observed signal has
strong (weak) evidence.

However, as pointed out in Ref. pj, the concept of statistical
significance has not been employed consistently in the most
important discoveries made over the last quarter century.
Moreover, the definitions of statistical significance used in
different measurements differ from each other. Listed below are
various definitions of the statistical significance in counting
experiments (see, for example, refs. @ 01 B):



$$ S_1 = (n - b)/\sqrt{b}, $$
where $n$ is the total number of observed events, a Poisson
variable with expectation $s + b$; $s$ is the expected number of
signal events being searched for, and $b$ is the known expected
number of Poisson-distributed background events. All numbers are
counted in the "signal region", where the signal events are
expected to appear. In equations (4) and (5), $k(\alpha)$ is a
factor related to $\alpha$, such that the corresponding
statistical significance assumes $1 - \alpha$ acceptance for a
positive decision about signal observation, with $k(0.5) = 0$,
$k(0.25) = 0.66$, $k(0.1) = 1.28$, $k(0.05) = 1.64$, etc 0. In
equation (6), $N(0,1)$ denotes the normal function with
expectation and variance equal to 0 and 1, respectively.

On the other hand, measurements in particle physics often examine
statistical variables that are continuous in nature. To identify
a sample of events enriched in the signal process, it is often
important to take into account the entire distribution of a given
variable for a set of events, rather than just to count the
events within a given signal region of values. In this situation,
I. Narsky [4] gives a definition of the statistical significance
via the likelihood function
$$ \sqrt{-2 \ln [L(b)/L(s+b)]} \qquad (7) $$



under the assumption that $-2\ln[L(b)/L(s+b)]$ is distributed as
a $\chi^2$ function with 1 degree of freedom.

Given the situation described above, it is clearly desirable to
have a self-consistent definition of statistical significance,
one that avoids the danger that the same $S$ value in different
measurements may imply quite different statistical significance,
and that is suitable for both counting experiments and continuous
test statistics. In this letter we propose a definition of the
statistical significance that comes closer to this desired
property.



$$ S_{B12} = 2 S_{12} - k(\alpha), \qquad (5) $$



II. DEFINITION OF THE STATISTICAL 
SIGNIFICANCE 
The p-value is defined to quantify the level of agreement between
the experimental data and a hypothesis Ref. P, . Assume an
experiment measures a test statistic $t$ to be $t_{obs}$, and
that $t$ has a probability density function $g(t|H_0)$ if the
null hypothesis $H_0$ is true. We further assume that large $t$
values correspond to poor agreement between the data and the null
hypothesis $H_0$; then the p-value of the experiment is

$$ p(t_{obs}) = P(t > t_{obs} | H_0) = \int_{t_{obs}}^{\infty} g(t|H_0)\,dt. \qquad (8) $$

A very small p-value is evidence against the null hypothesis
$H_0$ and tends to reject it.
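Equation (8) can be evaluated numerically when the density under the null hypothesis is known. The following is a minimal sketch, not part of the original letter: the function name `upper_tail_p_value`, the cutoff `t_max` that stands in for the infinite upper limit, and the exponential example density are all illustrative assumptions.

```python
import math
from typing import Callable


def upper_tail_p_value(pdf: Callable[[float], float], t_obs: float,
                       t_max: float, n_steps: int = 20000) -> float:
    """Approximate equation (8) with composite Simpson's rule on [t_obs, t_max].

    t_max is an assumed finite cutoff; it must be large enough that the
    density beyond it contributes negligibly to the integral.
    """
    if t_max <= t_obs:
        raise ValueError("t_max must exceed t_obs")
    if n_steps % 2:
        n_steps += 1  # Simpson's rule needs an even number of intervals
    h = (t_max - t_obs) / n_steps
    total = pdf(t_obs) + pdf(t_max)
    for i in range(1, n_steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * pdf(t_obs + i * h)
    return total * h / 3.0


# Hypothetical example: g(t|H0) = exp(-t) for t >= 0, whose exact upper
# tail is exp(-t_obs), so the numerical result can be checked directly.
p = upper_tail_p_value(lambda t: math.exp(-t), t_obs=3.0, t_max=60.0)
print(f"numerical p = {p:.6e}, exact p = {math.exp(-3.0):.6e}")
```

The exponential density is chosen only because its tail integral is known in closed form; any other density for $g(t|H_0)$ can be passed in the same way.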

Since the p-value of an experiment provides a measure of the
consistency between the $H_0$ hypothesis and the measurement, we
define the statistical significance $S$ in terms of the p-value
through




$$ \int_{-S}^{S} N(0,1)\,dx = 1 - p(t_{obs}) \qquad (9) $$



under the assumption that the null hypothesis $H_0$ states that
the observed events can be described by background processes
alone. Because a small p-value means a small probability of $H_0$
being true, which corresponds to a large probability of $H_1$
being true, one obtains a large signal significance $S$ for a
small p-value, and vice versa. The left side of equation (9)
represents the integral probability of the normal distribution in
the region within $\pm S$ standard deviations ($\pm S\sigma$);
therefore, this definition conforms to the meaning that the
statistical significance should have. With this definition, some
corresponding $S$ and p-values are listed in Table I.

TABLE I: Statistical significance S and the corresponding
p-value.

    S     p-value
    1     0.3173
    2     0.0455
    3     0.0027
    4     6.3 x 10^-5
    5     5.7 x 10^-7
    6     2.0 x 10^-9
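The entries of Table I follow directly from equation (9), since the normal probability outside $\pm S$ standard deviations equals the complementary error function evaluated at $S/\sqrt{2}$. The short sketch below reproduces the table under that identity; the function name `p_value_from_significance` is an illustrative choice, not notation from the letter.

```python
import math


def p_value_from_significance(S: float) -> float:
    """Return the p-value that equation (9) assigns to a significance S.

    1 - integral_{-S}^{S} N(0,1) dx = 1 - erf(S / sqrt(2)) = erfc(S / sqrt(2)).
    """
    if S < 0:
        raise ValueError("S must be non-negative")
    return math.erfc(S / math.sqrt(2.0))


# Print the S = 1..6 rows for comparison with Table I.
for S in range(1, 7):
    print(f"S = {S}   p-value = {p_value_from_significance(S):.2e}")
```

Using `math.erfc` rather than `1 - math.erf(...)` avoids the loss of precision that would otherwise occur for the very small p-values at large S.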



III. STATISTICAL SIGNIFICANCE IN 
COUNTING EXPERIMENT 

One class of particle physics experiments involves the search
for new phenomena or signals by observing a unique class of
events that cannot be described by background processes. One can
reduce this problem to that of a "counting experiment", where one
identifies a class of events using well-defined criteria, counts
the number of observed events, and estimates the average rate of
events contributed by various backgrounds in the signal region,
where the signal events (if they exist) will be clustered. Assume
that in an experiment the number of signal events in the signal
region is a Poisson variable with expectation $s$, while the
number of events from backgrounds is a Poisson variable with a
known expectation $b$ without error; then the observed number of
events is distributed as a Poisson variable with expectation
$s + b$. If the experiment observes $n_{obs}$ events in the
signal region, then the p-value is

$$ p(n_{obs}) = P(n \geq n_{obs} | H_0) = \sum_{n=n_{obs}}^{\infty} \frac{b^n}{n!} e^{-b} = 1 - \sum_{n=0}^{n_{obs}-1} \frac{b^n}{n!} e^{-b}. \qquad (10) $$

Substituting this relation into equation (9), one immediately
has

$$ \int_{-S}^{S} N(0,1)\,dx = \sum_{n=0}^{n_{obs}-1} \frac{b^n}{n!} e^{-b}. \qquad (11) $$

From this relation the signal significance $S$ can be determined
numerically. Comparing this equation with equation (6) given by
Ref. Q , we notice that the lower limit of the integral is
different.
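Equations (10) and (11) have no closed-form solution for $S$, but they are straightforward to solve numerically. The sketch below is one possible approach, assuming a known background expectation $b > 0$; the function names, the bisection bracket [0, 40], and the tolerance are illustrative choices rather than anything prescribed in the letter.

```python
import math


def poisson_p_value(n_obs: int, b: float) -> float:
    """Equation (10): P(n >= n_obs | H0) for a Poisson background of mean b."""
    if b <= 0:
        raise ValueError("b must be positive")
    if n_obs <= 0:
        return 1.0
    # First term b^n e^{-b} / n! at n = n_obs, computed in log space.
    term = math.exp(n_obs * math.log(b) - b - math.lgamma(n_obs + 1))
    total = 0.0
    n = n_obs
    # Sum the upper tail directly, which keeps precision for tiny p-values.
    while term > 1e-17 * total:
        total += term
        n += 1
        term *= b / n
    return min(total, 1.0)


def significance_from_p_value(p: float, tol: float = 1e-12) -> float:
    """Solve equation (9), erfc(S / sqrt(2)) = p, for S by bisection."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    lo, hi = 0.0, 40.0  # erfc(40 / sqrt(2)) underflows to zero in doubles
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erfc(mid / math.sqrt(2.0)) > p:
            lo = mid  # tail probability still too large, so S is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical numbers: 5 events observed on an expected background of 1.
p = poisson_p_value(5, 1.0)
print(f"p-value = {p:.6f}, S = {significance_from_p_value(p):.3f}")
```

Summing the upper tail of the Poisson distribution, instead of subtracting the lower sum from 1, is a design choice that avoids cancellation when the p-value is far below the double-precision rounding level.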

IV. STATISTICAL SIGNIFICANCE IN 
CONTINUOUS TEST STATISTICS 

The general problem in this situation can be stated as follows.
Suppose we identify a class of events using well-defined
criteria, which are characterized by a set of $N$ observations
$X_1, X_2, \ldots, X_N$ for a random variable $X$. In addition,
one has a hypothesis to test that predicts the probability
density function of $X$, say $f(X|\theta)$, where
$\theta = (\theta_1, \theta_2, \ldots, \theta_k)$ is a set of
parameters which need to be estimated from the data. Then the
problem is to define a statistic that gives a measure of the
consistency between the distribution of the data and the
distribution given by the hypothesis.

To be concrete, suppose the random variable $X$ is, say, an
invariant mass, and the $N$ observations $X_1, X_2, \ldots, X_N$
give an experimental distribution of $X$. Let the parameters be
$\theta = (\theta_1, \theta_2, \ldots, \theta_k) = (\theta_s; \theta_b)$,
where $\theta_s$ and $\theta_b$ represent the parameters related
to the signal (say, a resonance) and to the background
contribution, respectively. We assume that the null hypothesis
$H_0$ states that the experimental distribution of $X$ can be
described by the background processes alone, while the
alternative hypothesis $H_1$ states that the experimental
distribution of $X$ should be described by the backgrounds plus
signal. That is, the null hypothesis $H_0$ specifies fixed
value(s) for a subset of the parameters $\theta_s$ (the number of
fixed parameters is denoted by $r$), while the alternative
hypothesis $H_1$ leaves these $r$ parameter(s) free to take any
value(s) other than those specified in $H_0$. Therefore, under
$H_0$ the parameters $\theta$ are restricted to lie in a subspace
$\omega$ of the total space $\Omega$. On the basis of a data
sample of size $N$ from $f(X|\theta)$ we want to test the
hypothesis $H_0$: $\theta$ belongs to $\omega$. Given the
observations $X_1, X_2, \ldots, X_N$, the likelihood function is
$L = \prod_{i=1}^{N} f(X_i|\theta)$. The maximum of this function
over the total space $\Omega$ is denoted by $L(\hat{\Omega})$,
while within the subspace $\omega$ the maximum of the likelihood
function is denoted by $L(\hat{\omega})$; we then define the
likelihood ratio $\lambda = L(\hat{\omega})/L(\hat{\Omega})$. It
can be shown that, for $H_0$ true, the statistic

$$ t = -2\ln\lambda = 2(\ln L_{max}(s+b) - \ln L_{max}(b)) \qquad (12) $$

is distributed as $\chi^2(r)$ when the sample size $N$ is large
[fj. In equation (12) we use $L_{max}(s+b)$ and $L_{max}(b)$ to
denote $L(\hat{\Omega})$ and $L(\hat{\omega})$, respectively. If
$\lambda$ turns out to be in the neighborhood of 1, the null
hypothesis $H_0$ renders $L(\hat{\omega})$ close to the maximum
$L(\hat{\Omega})$, and hence $H_0$ will have a large probability
of being true. On the other hand, a small value of $\lambda$
indicates that $H_0$ is unlikely. Therefore, the critical region
of $\lambda$ is in the neighborhood of 0, corresponding to large
values of the statistic $t$. If the measured value of $t$ in an
experiment is $t_{obs}$, from equation (8) we have the p-value



$$ p(t_{obs}) = \int_{t_{obs}}^{\infty} \chi^2(t; r)\,dt. \qquad (13) $$

Therefore, in terms of equation (9), one can calculate the signal
significance according to the following expression:

$$ \int_{-S}^{S} N(0,1)\,dx = 1 - p(t_{obs}) = \int_{0}^{t_{obs}} \chi^2(t; r)\,dt. \qquad (14) $$

For the case of $r = 1$, we have

$$ \int_{-S}^{S} N(0,1)\,dx = \int_{0}^{t_{obs}} \chi^2(t; 1)\,dt = 2\int_{0}^{\sqrt{t_{obs}}} N(0,1)\,dx, $$

and immediately obtain

$$ S = \sqrt{t_{obs}} = [2(\ln L_{max}(s+b) - \ln L_{max}(b))]^{1/2}, \qquad (15) $$

which is identical to the equation (7) given by Ref. [4].

V. DISCUSSION AND SUMMARY

In Section II, the p-value defined by equation (8) is based on
the assumption that large $t$ values correspond to poor agreement
between the null hypothesis $H_0$ and the observed data; namely,
the critical region of the statistic $t$ for $H_0$ lies in the
upper tail of its distribution. If instead the critical region of
the statistic $t$ lies in the lower tail of its distribution,
then equation (8) should be replaced by

$$ p(t_{obs}) = P(t < t_{obs} | H_0) = \int_{-\infty}^{t_{obs}} g(t|H_0)\,dt, \qquad (16) $$

and the definition of statistical significance $S$ expressed by
equation (9) is still applicable. For the case that the critical
region of the statistic $t$ for $H_0$ lies in both the lower and
upper tails of its distribution, and one determines from an
experiment the observed $t$ values on both sides, $t^{L}_{obs}$
and $t^{U}_{obs}$, equation (8) should be replaced by

$$ p(t_{obs}) = P(t < t^{L}_{obs} | H_0) + P(t > t^{U}_{obs} | H_0) = \int_{-\infty}^{t^{L}_{obs}} g(t|H_0)\,dt + \int_{t^{U}_{obs}}^{\infty} g(t|H_0)\,dt. \qquad (17) $$
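For the likelihood-ratio statistic of Section IV, the chain from equation (12) through the $\chi^2$ tail probability of equation (13) to the significance of equations (14) and (15) can be sketched as follows. This is an illustration under stated assumptions: only $r = 1$ and $r = 2$ are handled, because their $\chi^2$ tail probabilities have simple closed forms; the function names and the example log-likelihood values are hypothetical.

```python
import math


def chi2_upper_tail(t: float, r: int) -> float:
    """Equation (13) for r = 1 or r = 2, using closed-form tail probabilities."""
    if t < 0:
        raise ValueError("t must be non-negative")
    if r == 1:
        return math.erfc(math.sqrt(t / 2.0))  # P(|Z| > sqrt(t)) for Z ~ N(0,1)
    if r == 2:
        return math.exp(-t / 2.0)
    raise NotImplementedError("only r = 1 and r = 2 are covered in this sketch")


def significance_from_p_value(p: float, tol: float = 1e-12) -> float:
    """Solve equation (9), erfc(S / sqrt(2)) = p, for S by bisection."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    lo, hi = 0.0, 40.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erfc(mid / math.sqrt(2.0)) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def significance_from_log_likelihoods(lnL_sb: float, lnL_b: float, r: int = 1) -> float:
    """Equations (12) to (14): t = 2 (ln Lmax(s+b) - ln Lmax(b)), then p, then S."""
    t_obs = 2.0 * (lnL_sb - lnL_b)
    return significance_from_p_value(chi2_upper_tail(t_obs, r))


# Hypothetical fit results with ln Lmax(s+b) - ln Lmax(b) = 12.5, so t_obs = 25.
print(significance_from_log_likelihoods(-100.0, -112.5, r=1))
print(significance_from_log_likelihoods(-100.0, -112.5, r=2))
```

For $r = 1$ the numerical solution agrees with the closed form $S = \sqrt{t_{obs}}$ of equation (15), while for $r = 2$ the same $t_{obs}$ gives a smaller significance because the extra free parameter broadens the null distribution of $t$.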



In summary, we have proposed a definition for the statistical
significance by establishing a correlation between the normal
distribution integral probability and the p-value observed in an
experiment, which is suitable for both counting experiments and
continuous test statistics. The explicit expressions to calculate
the statistical significance for counting experiments and
continuous test statistics in terms of the Poisson probability
and the likelihood ratio are given.